Facebook, Twitter, and LinkedIn generate a tremendous amount of valuable social data, but how can you find out who's making connections with social media, what they're talking about, or where they're located? This concise and practical book shows you how to answer these questions and more. You'll learn how to combine social web data, analysis techniques, and visualization to help you find what you've been looking for in the social haystack, as well as useful information you didn't know existed.
Each standalone chapter introduces techniques for mining data in different areas of the social Web, including blogs and email. All you need to get started is a programming background and a willingness to learn basic Python tools.
- Get a straightforward synopsis of the social web landscape
- Use adaptable scripts on GitHub to harvest data from social network APIs such as Twitter, Facebook, and LinkedIn
- Learn how to employ easy-to-use Python tools to slice and dice the data you collect
- Explore social connections in microformats with the XHTML Friends Network
- Apply advanced mining techniques such as TF-IDF, cosine similarity, collocation analysis, document summarization, and clique detection
"Let Matthew Russell serve as your guide to working with social data sets old (email, blogs) and new (Twitter, LinkedIn, Facebook). Mining the Social Web is a natural successor to Programming Collective Intelligence: a practical, hands-on approach to hacking on data from the social Web with Python." --Jeff Hammerbacher, Chief Scientist, Cloudera
"A rich, compact, useful, practical introduction to a galaxy of tools, techniques, and theories for exploring structured and unstructured data." --Alex Martelli, Senior Staff Engineer, Google
Introduction: hacking on Twitter data -- Microformats: semantic markup and common sense collide -- Mailboxes: oldies but goodies -- Twitter: friends, followers, and setwise operations -- Twitter: the tweet, the whole tweet, and nothing but the tweet -- LinkedIn: clustering your professional network for fun (and profit?) -- Google Buzz: TF-IDF, cosine similarity, and collocations -- Blogs et al.: natural language processing (and beyond) -- Facebook: the all-in-one wonder -- The semantic web: a cocktail discussion